Automatic Publication Data
Authors
Abstract
In many universities it would be useful to have a database of publications that reflects the research results of the academic staff. Such a database can be built by automatically retrieving publication information from faculty members' homepages. In this project, we deploy focused crawling to build such a system. We also propose a new focused crawling heuristic based on URL classification. We compare the performance of our proposed method with breadth-first crawling and a variant of context-focused crawling. Experimental results show that our new heuristic finds target pages faster, avoids irrelevant pages better, and outperforms the other crawling methods.
Subject Descriptors: H.3.1 Content Analysis and Indexing; H.3.3 Information Search and Retrieval; I.2.7 Natural Language Processing; I.2.8 Problem Solving, Control Methods, and Search
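The abstract does not spell out the heuristic, so the following is only a minimal sketch of the general idea of focused crawling driven by URL scoring, not the authors' method. It assumes a hypothetical keyword table (`URL_KEYWORDS`) in place of a trained URL classifier, a hypothetical in-memory link graph (`TOY_SITE`) in place of real HTTP fetching, and a hypothetical target test (`is_publication_page`). The crawler keeps a priority queue of unvisited URLs and always expands the one with the highest score, whereas breadth-first crawling would expand them in discovery order.

```python
import heapq
from typing import Callable, Dict, List, Tuple

# Hypothetical keyword weights; a stand-in for a trained URL classifier.
URL_KEYWORDS: Dict[str, float] = {
    "publication": 1.0,
    "paper": 0.8,
    "research": 0.5,
    "sport": -1.0,
}

# Hypothetical link graph used instead of fetching real pages.
TOY_SITE: Dict[str, List[str]] = {
    "http://example.edu/~alice/": [
        "http://example.edu/~alice/sports.html",
        "http://example.edu/~alice/research/",
    ],
    "http://example.edu/~alice/sports.html": [],
    "http://example.edu/~alice/research/": [
        "http://example.edu/~alice/research/publications.html",
    ],
    "http://example.edu/~alice/research/publications.html": [],
}


def score_url(url: str) -> float:
    """Sum the weights of all keywords that occur in the URL."""
    lowered = url.lower()
    return sum(w for kw, w in URL_KEYWORDS.items() if kw in lowered)


def is_publication_page(url: str) -> bool:
    """Hypothetical target test: the URL ends in 'publications.html'."""
    return url.endswith("publications.html")


def focused_crawl(
    seed: str,
    get_links: Callable[[str], List[str]],
    is_target: Callable[[str], bool],
    max_pages: int = 100,
) -> Tuple[List[str], List[str]]:
    """Best-first crawl: always expand the highest-scoring URL in the frontier."""
    # heapq is a min-heap, so scores are negated to pop the best URL first.
    frontier: List[Tuple[float, str]] = [(-score_url(seed), seed)]
    seen = {seed}
    visited: List[str] = []
    targets: List[str] = []
    while frontier and len(visited) < max_pages:
        _, url = heapq.heappop(frontier)
        visited.append(url)
        if is_target(url):
            targets.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score_url(link), link))
    return visited, targets


if __name__ == "__main__":
    visited, targets = focused_crawl(
        "http://example.edu/~alice/",
        lambda u: TOY_SITE.get(u, []),
        is_publication_page,
        max_pages=3,
    )
    print(visited)
    print(targets)
```

With a budget of three pages on this toy graph, the crawler reaches the publications page and never spends a fetch on the low-scoring sports page. A breadth-first crawler with the same budget would visit both children of the homepage first and run out of budget before reaching the target.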
Similar resources
Metadata Enrichment for Automatic Data Entry Based on Relational Data Models
The automatic generation of data entry forms from relational data models is a well-known idea that has been discussed more and more as agile methods in software development have grown in popularity alongside the development of programming tools. One of the requirements of the automation methods, whether in commercial products or the relevant research projects, ...
Predicting of Students' Anxiety on the basis of Emotional Regulation Difficulties and Negative Automatic Thoughts
Introduction: Anxiety is a psychological disorder whose causes are essential to understand. The aim of this study was to examine emotional regulation difficulties and negative automatic thoughts in the prediction of anxiety among students of Islamic Azad University, Bukan Branch. Methods: The method used is descriptive-correlational. The statistical population of this study includes all of college...
AN-EUL method for automatic interpretation of potential field data in unexploded ordnances (UXO) detection
We have applied AN-EUL, an automatic interpretation method for potential field data that combines the analytic signal and Euler deconvolution approaches, to unexploded ordnance (UXO) prospecting. The method can be applied to both magnetic and gravity data, as well as to gradient surveys, based upon the concept of the structural index (SI) of a potential anomaly which is relate...
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
Fuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition
In this paper, the use of clustering algorithms for data fusion at the decision level is proposed. The results of automatic isolated word recognition, which are derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined with each other by using fuzzy clustering algorithms, especially fuzzy k-means and fuzzy vector quantization. Experimental results show that the...
Publication Ethics: A Case Series with Recommendations According to Committee on Publication Ethics (COPE)
Ethical misconduct is not a new issue in the history of science and literature. However, ethical misconduct in science has grown considerably in the modern era, owing to the emphasis on scientific output in research institutes and the practice of gauging scientists according to their publications. In the current case series, several misconducts occurring over the previous years in Mashhad Univer...